Abstract. This paper continues the investigation of 'Wasserstein-like' transportation distances
for probability measures on discrete sets. We prove that the discrete transportation
metrics on the d-dimensional discrete torus with mesh size $1/N$ converge, when $N \to \infty$,
to the standard 2-Wasserstein distance on the continuous torus in the sense of Gromov-Hausdorff.
This is the first result of a passage to the limit from a discrete transportation
problem to a continuous one, and proves compatibility of the recently developed discrete
metrics and the well-established 2-Wasserstein metric.



1. Introduction 



Since the seminal work of Jordan, Kinderlehrer and Otto [12] it is known that the heat
flow on $\mathbb{R}^n$ is the gradient flow of the entropy with respect to the Wasserstein distance
$W_2$. Subsequently, this interpretation has been extended to a wide class of spaces, including
Riemannian manifolds [7], Finsler spaces [16], Alexandrov spaces [11], Wiener spaces [?], and
metric measure spaces [2, ?].

By contrast, the corresponding result fails in a discrete setting, but nevertheless it has
been shown recently [?, 13, 14] that the heat flow on a discrete space is the gradient flow of
the entropy, if one replaces the Wasserstein distance by a different metric $\mathcal{W}$. The key idea
in order to define the metric $\mathcal{W}$, in the spirit of the Benamou-Brenier formula [?], is to
minimize an action functional over curves in the space of probability measures, rather than
minimizing a cost functional over measures on the product space. An important ingredient
in the definition is the logarithmic mean $\theta(s,t) = \int_0^1 s^{1-p} t^p \, dp$, which is used to "average"
probability densities at neighbouring points.

In this paper we consider the space $\mathscr{P}(\mathbf{T}^d)$ of probability measures on the torus
$\mathbf{T}^d := \mathbb{R}^d/\mathbb{Z}^d$, endowed with the usual 2-Wasserstein metric $W_2$. We also consider the
d-dimensional periodic lattice $\mathbf{T}^d_N := (\mathbb{Z}/N\mathbb{Z})^d$ with mesh size $1/N$, and endow the space of
probability measures $\mathscr{P}(\mathbf{T}^d_N)$ with its renormalized discrete transportation metric $\mathcal{W}_N$ as
defined in [13] (see Section 2 below).

The main result of this paper is the following theorem, which proves compatibility between 
the discrete theory and the continuous one. 

Theorem 1.1. Let $d \ge 1$. Then the metric spaces $(\mathscr{P}(\mathbf{T}^d_N), \mathcal{W}_N)$ converge to
$(\mathscr{P}(\mathbf{T}^d), W_2)$ in the sense of Gromov-Hausdorff as $N \to \infty$.



Loosely speaking, Gromov-Hausdorff convergence means that there exists a sequence of
mappings $I_N : \mathscr{P}(\mathbf{T}^d) \to \mathscr{P}(\mathbf{T}^d_N)$ which are "approximately isometric and surjective", up to
an error which vanishes as $N \to \infty$. We refer to Definition 3.14 below for a formal definition.
The mappings $I_N$ that we shall use are of the form $I_N(\mu) = \mathcal{P}_N(H_s(\mu))$, where $\mathcal{P}_N$ denotes
the natural projection of $\mathscr{P}(\mathbf{T}^d)$ onto $\mathscr{P}(\mathbf{T}^d_N)$, and $(H_t)_{t \ge 0}$ is the heat semigroup, run for a
sufficiently small time $s = s(N)$.



The proof of Theorem 1.1 relies on two-sided bounds for $W_2$ in terms of $\mathcal{W}_N$. In particular
we shall prove a lower bound for $W_2$ of the form

$$\mathcal{W}_N\big(\mathcal{P}_N(H_s(\mu_0)), \mathcal{P}_N(H_s(\mu_1))\big) \le W_2(\mu_0, \mu_1) + \frac{C(s)}{\sqrt{N}} .$$

Interestingly, an upper bound for $W_2$ can be readily obtained in terms of a modification of
$\mathcal{W}_N$, which involves the harmonic mean instead of the logarithmic mean. Metrics of this form
have already been considered in [13]. A considerable part of the work in the current paper
consists of showing that the choice of the mean is irrelevant in the limit $N \to \infty$.

Let us remark that Gromov-Hausdorff convergence results such as in Theorem 1.1 can be 
used to prove convergence of gradient flows along the following lines: 

(i) Theorem 1.1 in this paper asserts that $(\mathscr{P}(\mathbf{T}^d_N), \mathcal{W}_N)$ converges to $(\mathscr{P}(\mathbf{T}^d), W_2)$ in the
sense of Gromov-Hausdorff.

(ii) Let $\pi_N$ be the uniform probability measure on $\mathbf{T}^d_N$ and let $\pi$ be the Lebesgue measure
on $\mathbf{T}^d$. It is not difficult to see that the relative entropy functionals $\mathrm{Ent}_{\pi_N}$ on $\mathscr{P}(\mathbf{T}^d_N)$
$\Gamma$-converge, as $N \to \infty$, to the relative entropy functional $\mathrm{Ent}_{\pi}$ on $\mathscr{P}(\mathbf{T}^d)$.

(iii) In [?] it has been proved that the functionals $\mathrm{Ent}_{\pi_N}$ are all geodesically convex on
$(\mathscr{P}(\mathbf{T}^d_N), \mathcal{W}_N)$.

(iv) From [13] we know that the gradient flow of $\mathrm{Ent}_{\pi_N}$ with respect to $\mathcal{W}_N$ produces solutions
to the heat flow.

(v) These results can be combined to obtain convergence of gradient flows, since it has
been proved in [?] that gradient flows of $\lambda$-geodesically convex functionals on
Gromov-Hausdorff convergent spaces are stable with respect to $\Gamma$-convergence.

Of course, the convergence of the discrete heat flow to the continuous one is not a new result,
and could also be proved directly, for instance using the explicit formulas for the heat kernels.
Yet the argument pointed out here has the advantage of being based on discrete Ricci bounds
and Gromov-Hausdorff convergence only, and as such this strategy can be useful also in other
situations. For instance, in [?] the stability of the heat flow in a continuous and non-smooth
context has been used to show the stability of Ricci curvature bounds in conjunction with the
linearity of the heat flow. Let us note that uniform geodesic $\lambda$-convexity for discretizations
of one-dimensional Fokker-Planck equations has been recently proved by Mielke [15].



This paper is structured as follows. In the preliminary Section 2 we recall some facts about
the Wasserstein metric $W_2$ and its discrete counterpart $\mathcal{W}$. We also collect some mostly
well-known properties of the heat flow that will be useful in the sequel. In Section 3.1 we
introduce the mappings that will be used to prove the Gromov-Hausdorff convergence result,
and we outline the strategy of the proof. Most of the actual work is done in Section 3.2, which
contains the crucial estimates. Finally, we put all pieces together in Section 3.3, in which we
prove Theorem 1.1.
2. Preliminaries

2.1. The 2-Wasserstein metric. Let $M$ be a compact smooth Riemannian manifold and
$\mathscr{P}(M)$ the set of Borel probability measures on it. The Wasserstein distance $W_2$ on $\mathscr{P}(M)$ is
usually defined by minimizing the transport cost with respect to the cost function
distance-squared. It has been emphasized by Benamou and Brenier [?] that a completely different
introduction to the subject can be given in terms of solutions to the continuity equation.
The following result has been proved for $M = \mathbb{R}^d$ in [?] (see also [17]), the case of general
manifolds being a consequence of Nash's embedding theorem (see also [[?], Proposition 2.5] for
a direct proof on manifolds).

Proposition/Definition 2.1. Let $M$ be a compact smooth Riemannian manifold and
$\mu, \nu \in \mathscr{P}(M)$. Then we have

$$W_2^2(\mu,\nu) = \min \int_0^1 \int_M |v_t|^2(x) \, d\mu_t(x) \, dt , \tag{2.1}$$

the minimum being taken among all distributional solutions $(\mu_t, v_t)$ of the continuity equation

$$\frac{d}{dt}\mu_t + \nabla \cdot (v_t \mu_t) = 0 , \tag{2.2}$$

such that $t \mapsto \mu_t$ is weakly continuous in duality with $C(M)$ and $\mu_0 = \mu$, $\mu_1 = \nu$.

In the sequel, when considering the continuous setting we will work with $M$ being the
d-dimensional torus $\mathbf{T}^d := \mathbb{R}^d/\mathbb{Z}^d$ and we will consider solutions to the continuity equation
in terms of probability densities and momentum vector fields. To fix the ideas, we give the
following definition.

Definition 2.2 (Solutions to the continuity equation in the continuous torus). Consider the
mappings $[0,1] \times \mathbf{T}^d \ni (t,x) \mapsto \rho_t(x) \in \mathbb{R}$ and $[0,1] \times \mathbf{T}^d \ni (t,x) \mapsto V_t(x) \in \mathbb{R}^d$. We say
that $(\rho_t, V_t)$ solves the continuity equation

$$\frac{d}{dt}\rho_t + \nabla \cdot V_t = 0 , \tag{2.3}$$

provided both $(t,x) \mapsto \rho_t(x)$ and $(t,x) \mapsto V_t(x)$ are in $L^1([0,1] \times \mathbf{T}^d)$, $t \mapsto \rho_t$ is continuous with
respect to convergence in duality with $C(\mathbf{T}^d)$, and (2.3) is satisfied in the sense of distributions.

2.2. Discrete transportation metrics. In several recent works [?] discrete analogues
of $W_2$ have been considered, which are well suited to study evolution equations in a
discrete setting. The definition of the Wasserstein distance requires a metric on the underlying
space. In [13], instead, the starting point is a Markov kernel $K$ on the finite set $X$, i.e.,
we assume that $K : X \times X \to \mathbb{R}_+$ satisfies $\sum_{y \in X} K(x,y) = 1$ for all $x \in X$. We assume that
$K$ is irreducible and denote the unique steady state by $\pi$. Thus $\pi$ is the unique probability
measure on $X$ satisfying

$$\pi(y) = \sum_{x \in X} \pi(x) K(x,y)$$

for all $y \in X$. We shall assume that $K$ is reversible, i.e., the detailed balance equations

$$K(x,y)\pi(x) = K(y,x)\pi(y)$$
hold for all $x, y \in X$. Since basic Markov chain theory implies that $\pi$ is strictly positive, we
can - and will - identify probability measures on $X$ with their densities with respect to $\pi$,
i.e., we set

$$\mathscr{P}(X) := \Big\{ \rho : X \to \mathbb{R}_+ \ \Big| \ \sum_{x \in X} \pi(x)\rho(x) = 1 \Big\} .$$

In order to define the metric $\mathcal{W}$ on $\mathscr{P}(X)$, we let $\theta : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}_+$ denote the logarithmic
mean, which is defined by

$$\theta(s,t) = \int_0^1 s^{1-p} t^p \, dp .$$

For $\rho \in \mathscr{P}(X)$ and $x, y \in X$ we set

$$\hat\rho(x,y) := \theta(\rho(x), \rho(y)) ,$$

which can be regarded informally as being "the density $\rho$ at the edge $(x,y)$". According to
[?, Lemma 2.6], the following definition can be taken as one of the equivalent definitions of
the transportation metric $\mathcal{W}$ on $\mathscr{P}(X)$.
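The logarithmic mean can be evaluated in closed form, since $\int_0^1 s^{1-p} t^p\,dp = (t-s)/(\log t - \log s)$ for positive $s \ne t$. The following minimal Python sketch compares the closed form with a midpoint-rule approximation of the integral; the function names `log_mean` and `log_mean_quadrature` are illustrative and do not appear in the paper.

```python
import math


def log_mean(s: float, t: float) -> float:
    """Logarithmic mean theta(s, t) = (t - s) / (log t - log s), with theta = 0 on the axes."""
    if s <= 0.0 or t <= 0.0:
        return 0.0
    x = (t - s) / s  # relative increment, always > -1 here
    if x == 0.0:
        return s
    # s * x / log(1 + x) equals (t - s) / (log t - log s); log1p keeps it stable for t close to s
    return s * x / math.log1p(x)


def log_mean_quadrature(s: float, t: float, n: int = 2000) -> float:
    """Midpoint-rule approximation of the defining integral int_0^1 s^(1-p) t^p dp."""
    h = 1.0 / n
    return h * sum(s ** (1.0 - (k + 0.5) * h) * t ** ((k + 0.5) * h) for k in range(n))
```

For instance, `log_mean(1.0, 2.0)` agrees with $1/\log 2$, and the quadrature value for $(s,t) = (2, 8)$ matches the closed form up to the discretisation error of the midpoint rule.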

Definition 2.3. Let $K$ be an irreducible and reversible Markov kernel on a finite set $X$, and
let $\bar\rho_0, \bar\rho_1 \in \mathscr{P}(X)$. The distance $\mathcal{W}(\bar\rho_0, \bar\rho_1)$ is defined by

$$\mathcal{W}(\bar\rho_0, \bar\rho_1)^2 = \inf \Big\{ \frac12 \int_0^1 \sum_{x,y \in X} \frac{V_t(x,y)^2}{\hat\rho_t(x,y)} K(x,y)\pi(x) \, dt \Big\} , \tag{2.4}$$

where the infimum runs over all curves $[0,1] \ni t \mapsto (\rho_t, V_t)$ such that:
(i) $\rho_t \in \mathscr{P}(X)$ for any $t \in [0,1]$, the function $t \mapsto \rho_t(x)$ is continuous for any $x \in X$, and
$\rho_0 = \bar\rho_0$, $\rho_1 = \bar\rho_1$;

(ii) $V_t : X \times X \to \mathbb{R}$ for any $t \in [0,1]$, and the function $t \mapsto V_t(x,y)$ belongs to $L^1(0,1)$ for
any $x, y \in X$;

(iii) the "discrete continuity equation"

$$\frac{d}{dt}\rho_t(x) + \frac12 \sum_{y \in X} \big( V_t(x,y) - V_t(y,x) \big) K(x,y) = 0 \tag{2.5}$$

holds for all $x \in X$ in the sense of distributions.

2.3. The transportation metric on the discrete torus. In this paper we shall only be
concerned with simple random walk on the d-dimensional discrete torus
$\mathbf{T}^d_N := (\mathbb{Z}/N\mathbb{Z})^d = \{0, \ldots, N-1\}^d$, in which case the kernel $K_N : \mathbf{T}^d_N \times \mathbf{T}^d_N \to [0,1]$ is given by

$$K_N(a,b) = \begin{cases} \frac{1}{2d}, & b = a \pm e_i \bmod N \text{ for some } i \in \{1, \ldots, d\} , \\ 0, & \text{otherwise} . \end{cases}$$

Here, $e_i$ denotes the i-th unit vector. All computations in $\mathbf{T}^d_N$ will be performed modulo $N$
without further mentioning.

In this case the stationary probability measure $\pi_N$ is the uniform measure given by
$\pi_N(a) = N^{-d}$ for all $a \in \mathbf{T}^d_N$. Therefore, the collection of probability densities with respect
to $\pi_N$ is given by

$$\mathscr{P}(\mathbf{T}^d_N) = \Big\{ \rho_N : \mathbf{T}^d_N \to \mathbb{R}_+ \ \Big| \ \sum_{a \in \mathbf{T}^d_N} \rho_N(a) = N^d \Big\} .$$
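As an illustration, the kernel $K_N$ and the properties used above (rows summing to one, symmetry, stationarity of the uniform measure) can be checked by brute force on a small torus. This is only a sketch: the helper names are hypothetical, and $N \ge 3$ is assumed so that $a + e_i$ and $a - e_i$ are distinct points.

```python
from itertools import product


def neighbours(a, N):
    """The 2d lattice points a +/- e_i (mod N) of a point a in the discrete torus."""
    out = []
    for i in range(len(a)):
        for sign in (1, -1):
            b = list(a)
            b[i] = (b[i] + sign) % N
            out.append(tuple(b))
    return out


def kernel(a, b, N):
    """K_N(a, b) for simple random walk (assumes N >= 3)."""
    return 1.0 / (2 * len(a)) if b in neighbours(a, N) else 0.0


def uniform_is_stationary(N, d, tol=1e-12):
    """Check that pi_N(a) = N^(-d) satisfies pi(b) = sum_a pi(a) K(a, b) for every b."""
    states = list(product(range(N), repeat=d))
    pi = 1.0 / N ** d
    return all(abs(sum(pi * kernel(a, b, N) for a in states) - pi) < tol for b in states)
```

Since $K_N$ is symmetric and $\pi_N$ is constant, the detailed balance equations hold trivially in this setting.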
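For functions $f, g : \mathbf{T}^d_N \to \mathbb{R}$ we consider the normalized $L^2$-inner product

$$\langle f, g \rangle_{L^2_N} := \frac{1}{N^d} \sum_{a \in \mathbf{T}^d_N} f(a) g(a)$$

and the Dirichlet form

$$\mathcal{E}_N(f,g) := \frac{1}{N^{d-2}} \sum_{a \in \mathbf{T}^d_N} \sum_{i=1}^d \big( f(a+e_i) - f(a) \big)\big( g(a+e_i) - g(a) \big) .$$

Furthermore we set

$$\|f\|_{L^2_N} := \sqrt{\langle f, f \rangle_{L^2_N}} , \qquad \mathcal{E}_N(f) := \mathcal{E}_N(f,f) .$$

Let $\Delta_N$ be the discrete Laplacian, defined by

$$\Delta_N f(a) = 2dN^2 (K_N - I) f(a) = N^2 \sum_{i=1}^d \big( f(a+e_i) - 2f(a) + f(a-e_i) \big)$$

for $a \in \mathbf{T}^d_N$. Notice that the following integration by parts formula holds:

$$\mathcal{E}_N(f,g) = -\langle \Delta_N f, g \rangle_{L^2_N} . \tag{2.6}$$

Moreover, given $g : \mathbf{T}^d_N \to \mathbb{R}$, the equation $\Delta_N f = g$ can be solved if and only if
$\sum_{a \in \mathbf{T}^d_N} g(a) = 0$, in which case the solution is unique up to an additive constant. We shall
use the well-known Poincaré inequality on $\mathbf{T}^d_N$, which we now recall.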

Proposition 2.4 (Poincaré inequality on $\mathbf{T}^d_N$). Let $d \ge 1$ and $N \ge 4$. For $f : \mathbf{T}^d_N \to \mathbb{R}$
with $\sum_{a \in \mathbf{T}^d_N} f(a) = 0$ we have

$$\|f\|^2_{L^2_N} \le \frac{1}{2N^2(1 - \cos(2\pi/N))} \, \mathcal{E}_N(f) ,$$

$$\mathcal{E}_N(f) \le \frac{1}{2N^2(1 - \cos(2\pi/N))} \, \|\Delta_N f\|^2_{L^2_N} .$$

Proof. One way to prove the first inequality is as follows. If $d = 1$, then the spectrum of the
operator $I - K_N$ on $L^2(\mathbf{T}_N, \pi_N)$ consists of the eigenvalues

$$1 - \cos(2\pi n/N) , \qquad 0 \le n \le N - 1 ,$$

(see, e.g., [?, Section 4.2]), which yields the result if $d = 1$. The result in dimension $d > 1$
follows by tensorization (see, e.g., [?, Lemma 3.2]).

The second inequality follows from the first one, using the integration by parts formula
(2.6) and the Cauchy-Schwarz inequality. □



Remark 2.5. In the limit $N \to \infty$, one recovers the classical Poincaré inequality on the torus

$$\|f\|^2_{L^2(\mathbf{T}^d)} \le \frac{1}{4\pi^2} \|\nabla f\|^2_{L^2(\mathbf{T}^d)} ,$$

valid for any $f$ with zero mean.
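The constant in Proposition 2.4 and its limit in Remark 2.5 are easy to inspect numerically. The sketch below works in dimension $d = 1$, where $\mathcal{E}_N(f) = N \sum_a (f(a+1) - f(a))^2$; the function names are illustrative only. The first Fourier mode $f(a) = \cos(2\pi a/N)$ attains equality in the first inequality.

```python
import math


def spectral_gap(N: int) -> float:
    """Smallest non-zero eigenvalue 2 N^2 (1 - cos(2 pi / N)) of -Delta_N."""
    return 2.0 * N * N * (1.0 - math.cos(2.0 * math.pi / N))


def l2_norm_sq(f):
    """Squared normalized L^2 norm on the one-dimensional discrete torus."""
    return sum(x * x for x in f) / len(f)


def dirichlet(f):
    """Dirichlet form E_N(f) in dimension d = 1, where the prefactor N^(2-d) equals N."""
    N = len(f)
    return N * sum((f[(a + 1) % N] - f[a]) ** 2 for a in range(N))
```

As $N$ grows, `spectral_gap(N)` increases towards $4\pi^2$, which is the constant appearing in the continuous inequality.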
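It will be useful to introduce some more notation. For $a = (a_1, \ldots, a_d) \in \mathbf{T}^d_N$ we define the
cube $Q^N_a$ by

$$Q^N_a := \Big[ \frac{a_1}{N}, \frac{a_1+1}{N} \Big) \times \cdots \times \Big[ \frac{a_d}{N}, \frac{a_d+1}{N} \Big) \subset \mathbf{T}^d ,$$

so that the torus $\mathbf{T}^d = \mathbb{R}^d/\mathbb{Z}^d$ can be written as the disjoint union

$$\mathbf{T}^d = \bigcup_{a \in \mathbf{T}^d_N} Q^N_a .$$

For $i = 1, \ldots, d$, the facets of $Q^N_a$ will be denoted by

$$R^N_{a,i-} := \Big[ \frac{a_1}{N}, \frac{a_1+1}{N} \Big) \times \cdots \times \Big\{ \frac{a_i}{N} \Big\} \times \cdots \times \Big[ \frac{a_d}{N}, \frac{a_d+1}{N} \Big) ,$$

$$R^N_{a,i+} := \Big[ \frac{a_1}{N}, \frac{a_1+1}{N} \Big) \times \cdots \times \Big\{ \frac{a_i+1}{N} \Big\} \times \cdots \times \Big[ \frac{a_d}{N}, \frac{a_d+1}{N} \Big) .$$

The collection of all these facets $R^N_{a,i\pm}$ will be denoted by $\mathcal{R}_N$. For $R = R^N_{a,i\pm} \in \mathcal{R}_N$ we shall
write

$$\hat\rho(R^N_{a,i\pm}) := \theta\big( \rho(a), \rho(a \pm e_i) \big) .$$

Notice that $K_N(a,b)$ is non-zero only for $a, b$ such that $a - b = \pm e_i$ for some $i = 1, \ldots, d$.
Therefore we can think about the vector fields $V : \mathbf{T}^d_N \times \mathbf{T}^d_N \to \mathbb{R}$ appearing in Definition
2.3 (ii) as being defined on facets $R \in \mathcal{R}_N$, rather than on generic couples $a, b$. This will be
our convention from now on.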

Let $\mathcal{W}_{K_N}$ denote the metric on $\mathscr{P}(\mathbf{T}^d_N)$ associated with the kernel $K_N$ according to
Definition 2.3. It will be convenient to work with the normalised metric

$$\mathcal{W}_N := \frac{\mathcal{W}_{K_N}}{N\sqrt{2d}} ,$$

which is a quantity of order 1.

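Given a probability density $\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$ and a 'momentum vector field' $V_N : \mathcal{R}_N \to \mathbb{R}$,
the action $\mathcal{A}_N$ of $(\rho_N, V_N)$ is defined by

$$\mathcal{A}_N(\rho_N, V_N) := \frac{1}{4d^2 N^{d+2}} \sum_{R \in \mathcal{R}_N} \frac{V_N(R)^2}{\hat\rho_N(R)} . \tag{2.7}$$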
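To make the normalisation in (2.7) concrete, here is a small sketch of the action in dimension $d = 1$, where the facets are the $N$ points between neighbouring sites and the prefactor is $1/(4N^3)$. The list layout (`V[a]` is the value on the facet between sites `a` and `a + 1`) is an assumption of this illustration, not notation from the paper.

```python
import math


def log_mean(s: float, t: float) -> float:
    """Logarithmic mean theta(s, t), equal to 0 if one argument vanishes."""
    if s <= 0.0 or t <= 0.0:
        return 0.0
    x = (t - s) / s
    return s if x == 0.0 else s * x / math.log1p(x)


def action_1d(rho, V):
    """Action (2.7) for d = 1: (1 / (4 N^3)) * sum over facets of V(R)^2 / rho_hat(R)."""
    N = len(rho)
    total = 0.0
    for a in range(N):
        rho_hat = log_mean(rho[a], rho[(a + 1) % N])
        if V[a] == 0.0:
            continue  # a vanishing momentum contributes nothing, even where rho_hat = 0
        if rho_hat == 0.0:
            return math.inf
        total += V[a] ** 2 / rho_hat
    return total / (4.0 * N ** 3)
```

For the uniform density and the constant field $V_N \equiv 2Nv$ (the discretisation of a constant continuum momentum $v$ under the projection introduced in Section 3.1), the action equals $v^2$ for every $N$, which matches the continuous quantity $\int |V|^2/\rho \, dx$.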



With this notation and taking Definition 2.3 into account, it is immediate to obtain the
following expression for the metric $\mathcal{W}_N$.

Lemma 2.6. For any $\bar\rho_{N,0}, \bar\rho_{N,1} \in \mathscr{P}(\mathbf{T}^d_N)$ we have

$$\mathcal{W}_N(\bar\rho_{N,0}, \bar\rho_{N,1})^2 = \inf \Big\{ \int_0^1 \mathcal{A}_N(\rho_{N,t}, V_{N,t}) \, dt \Big\} , \tag{2.8}$$

where the infimum runs over all curves $[0,1] \ni t \mapsto (\rho_{N,t}, V_{N,t})$ such that:
(i) $\rho_{N,t} \in \mathscr{P}(\mathbf{T}^d_N)$ for any $t \in [0,1]$, and the function $t \mapsto \rho_{N,t}(a)$ is continuous for any
$a \in \mathbf{T}^d_N$ with $\rho_{N,0} = \bar\rho_{N,0}$, $\rho_{N,1} = \bar\rho_{N,1}$;
(ii) $V_{N,t} : \mathcal{R}_N \to \mathbb{R}$ for any $t \in [0,1]$, and the function $t \mapsto V_{N,t}(R)$ belongs to $L^1(0,1)$ for
any $R \in \mathcal{R}_N$;
(iii) the discrete continuity equation

$$\frac{d}{dt}\rho_{N,t}(a) + \frac{1}{2d} \sum_{i=1}^d \big( V_{N,t}(R^N_{a,i+}) - V_{N,t}(R^N_{a,i-}) \big) = 0 \tag{2.9}$$

holds for all $a \in \mathbf{T}^d_N$ in the sense of distributions.

By analogy with Definition 2.2 we formulate the following discrete counterpart.

Definition 2.7 (Solutions to the continuity equation in the discrete torus). Let
$[0,1] \times \mathbf{T}^d_N \ni (t,a) \mapsto \rho_{N,t}(a) \in \mathbb{R}$ and $[0,1] \times \mathcal{R}_N \ni (t,R) \mapsto V_{N,t}(R) \in \mathbb{R}$. We say that
$(\rho_{N,t}, V_{N,t})$ is a solution to the discrete continuity equation (2.9) provided that (i), (ii) and
(iii) in Lemma 2.6 are fulfilled.
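The discrete continuity equation can be illustrated in dimension $d = 1$, where (2.9) reads $\dot\rho(a) + \tfrac12 (V(R_{a,+}) - V(R_{a,-})) = 0$ and $R_{a,-} = R_{a-1,+}$. The sketch below builds a momentum field for the linear interpolation $\rho_t = (1-t)\rho_0 + t\rho_1$ by a cumulative sum. The normalisation $V = 0$ on the last facet is a hypothetical choice (the field is only determined up to a constant), and the function names are not from the paper.

```python
def momentum_for_linear_interpolation(rho0, rho1):
    """Return (rho_dot, V) solving the d = 1 discrete continuity equation.

    V[a] is the momentum on the facet between sites a and a + 1 (mod N);
    rho0 and rho1 are assumed to have the same total mass.
    """
    rho_dot = [r1 - r0 for r0, r1 in zip(rho0, rho1)]
    V = []
    prev = 0.0  # assumed normalisation: V vanishes on the facet R_{N-1,+}
    for a in range(len(rho0)):
        prev = prev - 2.0 * rho_dot[a]  # V[a] - V[a-1] = -2 * rho_dot[a]
        V.append(prev)
    return rho_dot, V


def continuity_residual(rho_dot, V):
    """Left-hand side of (2.9) for d = 1 at every site (V[-1] wraps to the last facet)."""
    return [rho_dot[a] + 0.5 * (V[a] - V[a - 1]) for a in range(len(V))]
```

Moving one unit of density from site 1 to site 0 on a four-point torus gives a single non-zero momentum, of negative sign, on the facet between these two sites.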



Finally, we recall a couple of properties of $\mathcal{W}_N$ that will be used in the sequel. We shall
use the metric $d_N$ on $\mathbf{T}^d_N$ defined by

$$d_N(a,b) = \frac{1}{N} \sqrt{ \sum_{i=1}^d |a_i - b_i|^2 } ,$$

for $a, b \in \mathbf{T}^d_N$. We let

$$W_{2,N} \tag{2.10}$$

denote the standard 2-Wasserstein distance on $\mathscr{P}(\mathbf{T}^d_N)$ induced by the distance $d_N$ on $\mathbf{T}^d_N$.
In the following result we collect some basic properties of the metric $\mathcal{W}_N$.

Proposition 2.8. The following assertions hold.

(i) The function $(\rho, \sigma) \mapsto \mathcal{W}_N^2(\rho, \sigma)$ is convex on $\mathscr{P}(\mathbf{T}^d_N) \times \mathscr{P}(\mathbf{T}^d_N)$ with respect to linear
interpolation.

(ii) There exists a universal constant $C > 0$ such that

$$\mathcal{W}_N \le C\sqrt{d} \, W_{2,N} .$$

In particular, the diameter of the spaces $(\mathscr{P}(\mathbf{T}^d_N), \mathcal{W}_N)$ is bounded by a constant
depending only on the dimension.

Proof. The first assertion has been proved in [?, Proposition 2.8]. For the second assertion,
we apply [?, Proposition 2.12] to bound the discrete transportation metric by a multiple of
$W'_{2,N}$, with a constant involving a universal number $c \approx 0.78$; here $W'_{2,N}$ is the 2-Wasserstein
distance on $\mathscr{P}(\mathbf{T}^d_N)$ induced by the distance $d'_N$ on $\mathbf{T}^d_N$, defined by $d'_N(a,b) := \sum_{i=1}^d |a_i - b_i|$.
Since $d'_N(a,b) \le \sqrt{d} \, N d_N(a,b)$, we have $W'_{2,N} \le \sqrt{d} \, N W_{2,N}$, which yields the desired
estimate. Since the diameter of the spaces $(\mathbf{T}^d_N, d_N)$ is uniformly bounded by a dimensional
constant, the final assertion follows as well. □
2.4. Some properties of the heat semigroup on the discrete and continuous torus.

We endow the continuous torus $\mathbf{T}^d$ with its natural Riemannian flat distance, and we denote
the Lebesgue measure by $\pi$.

Let $(H_t)_{t \ge 0}$ be the heat semigroup on $\mathbf{T}^d$ with generator $\Delta$, acting either on measures or
functions. The heat semigroup on $\mathbf{T}^d_N$ is the semigroup generated by the discrete Laplacian
$\Delta_N$, and will be denoted by $(H^N_t)_{t \ge 0}$.

Let $h_t$ be the heat kernel on $\mathbf{T}^d$, i.e., the density of $H_t(\delta_0)$ with respect to $\pi$. Similarly, $h^N_t$
will denote the heat kernel on $\mathbf{T}^d_N$, which is defined by $h^N_t(x) = H^N_t(N^d \mathbf{1}_{\{0\}})(x)$. We thus
have the formulas

$$H_t f(x) = \int_{\mathbf{T}^d} h_t(x-y) f(y) \, d\pi(y) , \qquad H^N_t f_N(a) = \frac{1}{N^d} \sum_{b \in \mathbf{T}^d_N} h^N_t(a-b) f_N(b) ,$$

valid for all $L^1$-functions $f : \mathbf{T}^d \to \mathbb{R}$ and $f_N : \mathbf{T}^d_N \to \mathbb{R}$.

The heat semigroup on $\mathbf{T}^d$ acts on vector fields as well, coordinatewise. Similarly, the action
of $H^N_t$ on a vector field $V_N : \mathcal{R}_N \to \mathbb{R}$ can be defined via

$$H^N_t V_N(R^N_{a,i+}) := \frac{1}{N^d} \sum_{b \in \mathbf{T}^d_N} h^N_t(a-b) V_N(R^N_{b,i+}) . \tag{2.11}$$

Given a function $f : \mathbf{T}^d \to \mathbb{R}$, its Lipschitz constant will be denoted by $\mathrm{Lip}(f)$. Similarly,
we define the Lipschitz constant of a function $f : \mathbf{T}^d_N \to \mathbb{R}$ by

$$\mathrm{Lip}_N(f) := \sup_{a \ne b} \frac{|f(a) - f(b)|}{d_N(a,b)} .$$

The propositions below collect some basic properties of the heat flows that we will use in the
sequel.

Proposition 2.9 (Heat flow on the continuous torus). The following assertions hold for all
$s > 0$.

(i) There exist constants $c(s) > 0$ and $C(s) < \infty$ such that for any $\mu \in \mathscr{P}(\mathbf{T}^d)$ the density
$\rho_s$ of $H_s \mu$ satisfies

$$\rho_s \ge c(s) \quad \text{and} \quad \mathrm{Lip}(\rho_s) \le C(s) .$$

Furthermore, there exists a dimensional constant $C < \infty$ such that

$$W_2(H_s \mu, \mu) \le C\sqrt{s} .$$

(ii) There exists a constant $C(s) < \infty$ such that for any $f \in L^1(\mathbf{T}^d)$ we have

$$\|H_s f\|_{L^\infty} + \mathrm{Lip}(H_s f) \le C(s) \|f\|_{L^1} .$$

(iii) Let $(\mu_t) \subset \mathscr{P}(\mathbf{T}^d)$ be a geodesic, let $v_t$ be the corresponding velocity vector fields achieving
the minimum in (2.1), and let $\rho_{s,t}$ and $V_{s,t}$ be the densities of $H_s(\mu_t)$ and $H_s(v_t \mu_t)$
respectively. Then, $t \mapsto (\rho_{s,t}, V_{s,t})$ is a solution to the continuity equation (2.3), and we have

$$\int_0^1 \int_{\mathbf{T}^d} \frac{|V_{s,t}(x)|^2}{\rho_{s,t}(x)} \, dx \, dt \le W_2^2(\mu_0, \mu_1) . \tag{2.12}$$
Proof. The first assertions in (i) are obvious. To prove the last claim in (i), notice that
by the convexity of $W_2^2$ it is sufficient to prove the claim when $\mu$ is a Dirac mass. In this
case the result follows from the fact that the heat kernel on the torus can be represented by
periodization of the heat kernel on $\mathbb{R}^d$, and the parabolic scaling of the latter.

The result of (ii) is standard, and (iii) follows from the convexity of
$\mathbb{R}^d \times \mathbb{R}_+ \ni (x,a) \mapsto \frac{|x|^2}{a}$ and the fact that $H_s$ is a convolution operator, see, e.g.,
Lemma 8.1.10 in [?]. □

Proposition 2.10 (Heat flow on the discrete torus). The following assertions hold for $s > 0$.

(i) There exists a dimensional constant $C > 0$ such that for any $\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$ we have

$$\mathrm{Lip}_N(H^N_s \rho_N) \le \min\{ C s^{-(d+1)/2} , \ \mathrm{Lip}_N(\rho_N) \} .$$

(ii) For any $\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$ and any momentum vector field $V_N : \mathcal{R}_N \to \mathbb{R}$ we have

$$\mathcal{A}_N(H^N_s \rho_N, H^N_s V_N) \le \mathcal{A}_N(\rho_N, V_N) .$$

Proof. The estimate $\mathrm{Lip}_N(H^N_s \rho_N) \le \mathrm{Lip}_N(\rho_N)$ in (i) is a simple consequence of the fact that
the heat semigroup consists of convolution operators. Taking the convexity of
$(x,a,b) \mapsto \frac{x^2}{\theta(a,b)}$ into account, this also gives (ii).

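To prove the remaining bound in (i), we note that for any probability density
$\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$,

$$|H^N_s \rho_N(a) - H^N_s \rho_N(b)| = \frac{1}{N^d} \Big| \sum_{c \in \mathbf{T}^d_N} \big( h^N_s(a-c) - h^N_s(b-c) \big) \rho_N(c) \Big|$$
$$\le \frac{1}{N^d} \sum_{c \in \mathbf{T}^d_N} \rho_N(c) \sup_{c \in \mathbf{T}^d_N} |h^N_s(a-c) - h^N_s(b-c)| = \sup_{c \in \mathbf{T}^d_N} |h^N_s(a-c) - h^N_s(b-c)| .$$

Since $h^N_s(a) = h^{1,N}_s(a_1) \cdot \ldots \cdot h^{1,N}_s(a_d)$, where $h^{1,N}$ denotes the heat kernel in one dimension,
we infer that

$$|h^N_s(a) - h^N_s(b)| \le \|h^{1,N}_s\|^{d-1}_{L^\infty} \sum_{k=1}^d |h^{1,N}_s(a_k) - h^{1,N}_s(b_k)| \le \sqrt{d} \, d_N(a,b) \, \|h^{1,N}_s\|^{d-1}_{L^\infty} \mathrm{Lip}_N(h^{1,N}_s)$$

and therefore

$$\mathrm{Lip}_N(H^N_s \rho_N) \le \sqrt{d} \, \|h^{1,N}_s\|^{d-1}_{L^\infty} \mathrm{Lip}_N(h^{1,N}_s) , \tag{2.13}$$

so it remains to obtain bounds on the heat kernel in one dimension. These can be obtained
using the well-known (and easy to check) fact that, if $d = 1$, the spectrum of the operator
$-\Delta_N$ consists of the eigenvalues

$$\lambda_l = 2N^2 \big( 1 - \cos(2\pi l/N) \big) , \qquad l \in L_N := \Big\{ l \in \mathbb{Z} : -\Big\lceil \frac{N}{2} \Big\rceil + 1 \le l \le \Big\lfloor \frac{N}{2} \Big\rfloor \Big\} .$$

Note that $\lambda_l = \lambda_{-l}$. The corresponding eigenvectors $v_l$ are given by

$$v_l(a) = \exp\Big( \frac{2\pi i \, l a}{N} \Big) , \qquad l \in L_N .$$

As a consequence, the heat kernel $h^{1,N}_s$ can be written explicitly as

$$h^{1,N}_s(a) = \sum_{l \in L_N} e^{-\lambda_l s} v_l(a) .$$

We shall use the fact that there exists a constant $c > 0$ such that for all $N \ge 1$ and $l \in L_N$,

$$|\lambda_l| \ge c l^2 , \qquad \|v_l\|_{L^\infty} \le 1 , \qquad \text{and} \qquad \mathrm{Lip}_N(v_l) \le c |l| .$$

It follows that for some constant $C > 0$ and all $a, b \in \mathbf{T}_N$,

$$|h^{1,N}_s(a)| \le \sum_{l \in L_N} e^{-\lambda_l s} |v_l(a)| \le \sum_{l \in L_N} e^{-c l^2 s} \le \frac{C}{\sqrt{s}} ,$$

$$|h^{1,N}_s(a) - h^{1,N}_s(b)| \le \sum_{l \in L_N} e^{-\lambda_l s} |v_l(a) - v_l(b)| \le C \sum_{l \in L_N} |l| e^{-c l^2 s} \, d_N(a,b) \le \frac{C}{s} \, d_N(a,b) ,$$

so that $\|h^{1,N}_s\|_{L^\infty} \le C s^{-1/2}$ and $\mathrm{Lip}_N(h^{1,N}_s) \le C s^{-1}$. Plugging these estimates into (2.13),
we obtain the desired result. □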
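The spectral representation of the one-dimensional heat kernel used in the proof can be evaluated directly. The following sketch sums the eigenfunction expansion over the index set $L_N$, keeping only real parts (the imaginary parts cancel because $\lambda_l = \lambda_{-l}$); the function name is illustrative.

```python
import math


def heat_kernel_1d(N: int, s: float):
    """Values h_s^{1,N}(a), a = 0..N-1, from the expansion sum_l exp(-lambda_l s) v_l(a)."""
    # N consecutive integers: -ceil(N/2) + 1 <= l <= floor(N/2)
    modes = range(-((N + 1) // 2) + 1, N // 2 + 1)
    lam = {l: 2.0 * N * N * (1.0 - math.cos(2.0 * math.pi * l / N)) for l in modes}
    return [
        sum(math.exp(-lam[l] * s) * math.cos(2.0 * math.pi * l * a / N) for l in modes)
        for a in range(N)
    ]
```

Since $h^{1,N}_s$ is a probability density with respect to the uniform measure, its values are positive and average to one, and the kernel is symmetric around the origin with its maximum at $a = 0$.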

3. Proof of the main result 

3.1. Ingredients and structure of the proof. In order to prove the stated Gromov-Hausdorff
convergence of the spaces $(\mathscr{P}(\mathbf{T}^d_N), \mathcal{W}_N)$, we will introduce the natural mappings
from the continuous torus to the discrete one, and those going the other way around.

First we construct discrete measures by integration over cubes, and discrete vector fields
by integration over facets:

Definition 3.1 (From $\mathbf{T}^d$ to $\mathbf{T}^d_N$). Given a probability measure $\mu \in \mathscr{P}(\mathbf{T}^d)$ and $N \in \mathbb{N}$ the
probability density $\mathcal{P}_N(\mu) \in \mathscr{P}(\mathbf{T}^d_N)$ is defined as

$$\mathcal{P}_N(\mu)(a) := N^d \mu(Q^N_a) .$$

Similarly, given a continuous momentum vector field $V = (V_1, \ldots, V_d) : \mathbf{T}^d \to \mathbb{R}^d$ we define
$\mathcal{P}_N(V) : \mathcal{R}_N \to \mathbb{R}$ by

$$\mathcal{P}_N(V)(R) := 2dN^d \int_R V_i(x) \, dx , \qquad R = R^N_{a,i\pm} \in \mathcal{R}_N .$$

Probability densities on $\mathbf{T}^d$ are defined by piecewise constant extensions of densities on
$\mathbf{T}^d_N$, and vector fields on $\mathbf{T}^d$ are defined by linear interpolation.

Definition 3.2 (From $\mathbf{T}^d_N$ to $\mathbf{T}^d$). Given a probability density $\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$ and a momentum
vector field $V_N : \mathcal{R}_N \to \mathbb{R}$, the probability measure $\mathcal{Q}_N(\rho_N) \in \mathscr{P}(\mathbf{T}^d)$ and the momentum
vector field $\mathcal{Q}_N(V_N) : \mathbf{T}^d \to \mathbb{R}^d$ are defined through the density (with respect to $\pi$) and the
components

$$\mathcal{Q}_N(\rho_N)(x) := \rho_N(a) ,$$
$$\mathcal{Q}_N(V_N)_i(x) := \frac{1}{2dN} \Big( (1 - N x_i + a_i) V_N(R^N_{a,i-}) + (N x_i - a_i) V_N(R^N_{a,i+}) \Big) ,$$

where $a = (a_1, \ldots, a_d) \in \mathbf{T}^d_N$ is uniquely determined by the condition $x = (x_1, \ldots, x_d) \in Q^N_a$.

The maps $\mathcal{P}_N$, $\mathcal{Q}_N$ will be the ones that we use to prove Gromov-Hausdorff convergence.
They are constructed in such a way that ensures that solutions of the continuity equation are
mapped to solutions of the continuity equation.
Proposition 3.3. The following assertions hold:

(1) Let $(\rho_t, V_t)$ be a solution to the continuity equation (2.3) such that the mapping
$x \mapsto V_t(x)$ is continuous for almost every $t$. Then $(\mathcal{P}_N(\rho_t), \mathcal{P}_N(V_t))$ solves the discrete
continuity equation (2.9).

(2) Vice versa, let $(\rho_{N,t}, V_{N,t})$ be a solution to the discrete continuity equation (2.9). Then
$(\mathcal{Q}_N(\rho_{N,t}), \mathcal{Q}_N(V_{N,t}))$ solves the continuity equation (2.3).

Proof. These statements are direct consequences of the definitions and the Gauss-Green
Theorem. □

It follows from the definitions that $\mathcal{P}_N \circ \mathcal{Q}_N$ is the identity operator on $\mathscr{P}(\mathbf{T}^d_N)$. On the
other hand, $\mathcal{Q}_N \circ \mathcal{P}_N$ is a good approximation of the identity in the following sense.
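The identity $\mathcal{P}_N \circ \mathcal{Q}_N = \mathrm{id}$ can be checked concretely in dimension $d = 1$. In the sketch below a measure on $[0,1)$ is represented by its cumulative distribution function, which is an assumption made purely for this illustration; `project`, `extend` and `example_cdf` are hypothetical names. The extension is taken to be the piecewise-constant density equal to $\rho_N(a)$ on the interval $Q^N_a$, so that the total mass is one.

```python
import math


def project(cdf, N):
    """P_N(mu)(a) = N * mu(Q_a) for d = 1, with mu described by its CDF on [0, 1]."""
    return [N * (cdf((a + 1) / N) - cdf(a / N)) for a in range(N)]


def extend(rho_N):
    """CDF of the piecewise-constant density equal to rho_N[a] on [a/N, (a+1)/N)."""
    N = len(rho_N)

    def cdf(x):
        x = min(max(x, 0.0), 1.0)
        a = min(int(x * N), N - 1)  # index of the interval containing x
        return (sum(rho_N[:a]) + rho_N[a] * (x * N - a)) / N

    return cdf


def example_cdf(x):
    """CDF of the smooth density 1 + 0.5 * sin(2 pi x), used only as a test measure."""
    return x + (1.0 - math.cos(2.0 * math.pi * x)) / (4.0 * math.pi)
```

Projecting the extension of a lattice density returns the same lattice density, while projecting a general measure only records its cube masses, which is why $\mathcal{Q}_N \circ \mathcal{P}_N$ is merely close to the identity.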

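Lemma 3.4. For all $\mu \in \mathscr{P}(\mathbf{T}^d)$ and all $N \ge 2$ we have

$$W_2(\mathcal{Q}_N(\mathcal{P}_N(\mu)), \mu) \le \frac{\sqrt{d}}{N} . \tag{3.1}$$

Proof. Since both measures agree on each cube $Q^N_a$, it follows that

$$W_2(\mathcal{Q}_N(\mathcal{P}_N(\mu)), \mu)^2 \le \sum_{a \in \mathbf{T}^d_N} \mu(Q^N_a) \, \mathrm{diam}(Q^N_a)^2 .$$

Taking into account that the diameter of each $Q^N_a$ equals $\sqrt{d}/N$, the result follows. □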

The following simple result allows us to compare the 2-Wasserstein distances on $\mathscr{P}(\mathbf{T}^d)$
and $\mathscr{P}(\mathbf{T}^d_N)$. Recall that $W_{2,N}$ has been defined in (2.10).

Lemma 3.5. For all $\mu_0, \mu_1 \in \mathscr{P}(\mathbf{T}^d)$ we have

$$W_{2,N}(\mathcal{P}_N(\mu_0), \mathcal{P}_N(\mu_1)) \le \sqrt{2} \, W_2(\mu_0, \mu_1) + \frac{\sqrt{2d}}{N} .$$

Proof. Define $T_N : \mathbf{T}^d \to \mathbf{T}^d_N$ by $T_N(x) := a$ whenever $x \in Q^N_a$. Since
$|(T_N x)_i - (T_N y)_i| \le 1 + N|x_i - y_i|$ for $x, y \in \mathbf{T}^d$, we have

$$d_N(T_N x, T_N y) \le |x - y| + \frac{\sqrt{d}}{N} .$$

Using the fact that $\mathcal{P}_N(\mu_i) = (T_N)_\# \mu_i$, the result follows. □
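The pointwise estimate behind the proof can be probed on random pairs of points. The sketch below implements the cell map $T_N$ and the two periodic distances, with points of the torus represented by coordinates in $[0,1)$; all function names are illustrative, and the estimate tested is $d_N(T_N x, T_N y) \le |x - y| + \sqrt{d}/N$.

```python
import math
import random


def torus_dist(x, y):
    """Flat distance on the continuous torus, coordinates taken in [0, 1)."""
    return math.sqrt(sum(min(abs(u - v), 1.0 - abs(u - v)) ** 2 for u, v in zip(x, y)))


def lattice_dist(a, b, N):
    """The metric d_N on the discrete torus, with differences computed modulo N."""
    return math.sqrt(sum(min(abs(i - j), N - abs(i - j)) ** 2 for i, j in zip(a, b))) / N


def cell_map(x, N):
    """T_N(x): index a of the cube Q_a containing x."""
    return tuple(int(math.floor(N * u)) % N for u in x)


def max_violation(N, d, samples=1000, seed=0):
    """Largest value of d_N(T_N x, T_N y) - |x - y| - sqrt(d)/N over random pairs."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(samples):
        x = [rng.random() for _ in range(d)]
        y = [rng.random() for _ in range(d)]
        gap = lattice_dist(cell_map(x, N), cell_map(y, N), N) - torus_dist(x, y)
        worst = max(worst, gap - math.sqrt(d) / N)
    return worst
```

A non-positive return value of `max_violation` on a sample is consistent with the estimate, though of course it does not replace the short argument given in the proof.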

In order to carry out our estimates, we will sometimes need some regularity on the
probability densities involved. For this reason, we introduce the following set.

Definition 3.6 (Regular densities). Let $\delta > 0$. Then the set $\mathscr{P}_\delta(\mathbf{T}^d_N) \subset \mathscr{P}(\mathbf{T}^d_N)$ is the set
of probability densities $\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$ such that

$$\min_{a \in \mathbf{T}^d_N} \rho_N(a) \ge \delta , \qquad \mathrm{Lip}_N(\rho_N) \le \delta^{-1} .$$

Notice that the projections $\mathcal{P}_N$ preserve this sort of regularity, i.e.,

$$\mathrm{Lip}_N(\mathcal{P}_N(\rho)) \le \mathrm{Lip}(\rho) , \qquad \min_{a \in \mathbf{T}^d_N} \mathcal{P}_N(\rho)(a) \ge \inf_{x \in \mathbf{T}^d} \rho(x) , \tag{3.2}$$

as is readily checked from the definitions.

The set $\mathscr{P}_\delta(\mathbf{T}^d_N)$ is endowed with the following distance, which is obtained by minimizing
the action functional over all paths in the space of regular densities.
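Definition 3.7 (The distance $\mathcal{W}_{N,\delta}$). Let $\delta > 0$ and $\bar\rho_{N,0}, \bar\rho_{N,1} \in \mathscr{P}_\delta(\mathbf{T}^d_N)$. The distance
$\mathcal{W}_{N,\delta}(\bar\rho_{N,0}, \bar\rho_{N,1})$ is defined as

$$\big( \mathcal{W}_{N,\delta}(\bar\rho_{N,0}, \bar\rho_{N,1}) \big)^2 := \inf \Big\{ \int_0^1 \mathcal{A}_N(\rho_{N,t}, V_{N,t}) \, dt \Big\} ,$$

the infimum being taken among all solutions $(\rho_{N,t}, V_{N,t})$ of the continuity equation (2.9) such
that $\rho_{N,t} \in \mathscr{P}_\delta(\mathbf{T}^d_N)$ for any $t \in [0,1]$.

The last tool that we need is a variant of the distance $\mathcal{W}_N$ on $\mathscr{P}(\mathbf{T}^d_N)$, where instead of
the logarithmic mean $\theta$ one considers the harmonic mean $\tilde\theta$ given by

$$\tilde\theta(a,b) := \frac{2ab}{a+b}$$

for any $a, b > 0$. If $a = 0$ or $b = 0$, we set $\tilde\theta(a,b) = 0$. For $\rho_N \in \mathscr{P}(\mathbf{T}^d_N)$ and $R = R^N_{a,i+} \in \mathcal{R}_N$
we put

$$\tilde\rho_N(R) := \tilde\theta\big( \rho_N(a), \rho_N(a + e_i) \big) .$$

Definition 3.8 (The distance $\widetilde{\mathcal{W}}_N$). For $\bar\rho_{N,0}, \bar\rho_{N,1} \in \mathscr{P}(\mathbf{T}^d_N)$, the metric
$\widetilde{\mathcal{W}}_N(\bar\rho_{N,0}, \bar\rho_{N,1})$ is defined as

$$\big( \widetilde{\mathcal{W}}_N(\bar\rho_{N,0}, \bar\rho_{N,1}) \big)^2 := \inf \Big\{ \int_0^1 \frac{1}{4d^2 N^{d+2}} \sum_{R \in \mathcal{R}_N} \frac{V_{N,t}(R)^2}{\tilde\rho_{N,t}(R)} \, dt \Big\} ,$$

the infimum being taken among all solutions $(\rho_{N,t}, V_{N,t})$ of the continuity equation (2.9).

Distances of this form have already been introduced in [13]. Notice that since
$\tilde\theta(a,b) \le \theta(a,b)$ for any $a, b > 0$, it follows immediately that $\widetilde{\mathcal{W}}_N \ge \mathcal{W}_N$.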


Let us now describe our strategy to prove Theorem 1.1. We start with two measures
$\mu_0, \mu_1 \in \mathscr{P}(\mathbf{T}^d)$, regularize them a bit using the heat flow for a short time $s > 0$, and then
show (Proposition 3.10) that for some constant $C(s) < \infty$ (independent of $\mu_0, \mu_1$) we have

$$\mathcal{W}_N\big( \mathcal{P}_N(H_s(\mu_0)), \mathcal{P}_N(H_s(\mu_1)) \big) \le W_2(\mu_0, \mu_1) + \frac{C(s)}{\sqrt{N}} .$$

This will follow quite easily. The converse inequality will be harder to achieve, as the
natural inequality that one obtains for $\rho^N_0, \rho^N_1 \in \mathscr{P}(\mathbf{T}^d_N)$ (in Proposition 3.11) involves the
harmonic mean rather than the logarithmic mean, i.e., we prove that

$$W_2(\mathcal{Q}_N(\rho^N_0), \mathcal{Q}_N(\rho^N_1)) \le \widetilde{\mathcal{W}}_N(\rho^N_0, \rho^N_1) .$$

Thus the problem becomes to bound $\widetilde{\mathcal{W}}_N$ from above in terms of $\mathcal{W}_N$ plus a small error.
Unfortunately, the harmonic-logarithmic mean inequality $\tilde\theta(a,b) \le \theta(a,b)$ goes in the 'wrong'
direction, but the elementary inequality

$$\frac{1}{\tilde\theta(a,b)} - \frac{1}{\theta(a,b)} \le \frac{(b-a)^2}{ab} \, \frac{1}{\theta(a,b)}$$

that we establish in Proposition 3.12, allows us to obtain an estimate for all regular densities,
i.e.,

$$\widetilde{\mathcal{W}}_N(\rho^N_0, \rho^N_1) \le \Big( 1 + \frac{1}{\delta^4 N^2} \Big)^{1/2} \mathcal{W}_{N,\delta}(\rho^N_0, \rho^N_1)$$

for $\rho^N_0, \rho^N_1 \in \mathscr{P}_\delta(\mathbf{T}^d_N)$.

Thus at the end everything reduces to proving that $\mathcal{W}_{N,\delta}$ can be bounded above, up to a small
error, by $\mathcal{W}_N$. Clearly, this is false without some additional assumptions on the measures
we want to interpolate. The idea is then to notice that the measures on the discrete torus
that we produced in our first step, using $\mathcal{P}_N$ after an application of the heat flow, belong to
$\mathscr{P}_\delta(\mathbf{T}^d_N)$ for some $\delta > 0$. We then show in Proposition 3.13, which is technically the most
involved, that given $\varepsilon, \delta > 0$, there exists $\bar\delta > 0$ such that the bound

$$\mathcal{W}_{N,\bar\delta}(\bar\rho_{N,0}, \bar\rho_{N,1}) \le \mathcal{W}_N(\bar\rho_{N,0}, \bar\rho_{N,1}) + \varepsilon$$

holds for any $\bar\rho_{N,0}, \bar\rho_{N,1} \in \mathscr{P}_\delta(\mathbf{T}^d_N)$. This will be enough to complete the argument.

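3.2. Estimates. Here we collect all the estimates that we need to implement the strategy
outlined above. We start by observing the effect of $\mathcal{P}_N$ on the action of vector fields.

Lemma 3.9. Let $\mu = \rho\pi \in \mathscr{P}(\mathbf{T}^d)$ be a probability measure and $V : \mathbf{T}^d \to \mathbb{R}^d$ a momentum
vector field. Assume that both $\rho$ and $V$ are Lipschitz and that $\min \rho > 0$. Put $\rho_N := \mathcal{P}_N(\mu)$
and $V_N := \mathcal{P}_N(V)$. Then there exists a universal constant $C > 0$, such that for any $N \ge 1$
we have the bound

$$\mathcal{A}_N(\rho_N, V_N) \le \int_{\mathbf{T}^d} \frac{|V(x)|^2}{\rho(x)} \, dx + \frac{Cd}{N} \Big( \frac{\|V\|_{L^\infty} \mathrm{Lip}(V)}{\min \rho} + \frac{\|V\|^2_{L^\infty} \mathrm{Lip}(\rho)}{(\min \rho)^2} \Big) . \tag{3.3}$$

Proof. We apply Jensen's inequality to the convex function $(x,y,z) \mapsto \frac{x^2}{\theta(y,z)}$ to obtain for
$R = R^N_{a,i+} \in \mathcal{R}_N$,

$$\frac{V_N(R)^2}{2d^2 N^{d+2} \hat\rho_N(R)} = \frac{2 \big( \int_R V_i(r) \, dr \big)^2}{N^2 \, \theta\big( \int_R \int_0^{1/N} \rho(r - h e_i) \, dh \, dr , \ \int_R \int_0^{1/N} \rho(r + h e_i) \, dh \, dr \big)}$$
$$\le 2 \int_R \int_0^{1/N} \frac{|V_i(r)|^2}{\theta\big( \rho(r - h e_i), \rho(r + h e_i) \big)} \, dh \, dr \tag{3.4}$$
$$= \int_R \int_{-1/N}^{1/N} \frac{|V_i(r)|^2}{\theta\big( \rho(r - h e_i), \rho(r + h e_i) \big)} \, dh \, dr .$$

Using the elementary fact that for $x, \bar x \in \mathbb{R}$ and $y, \bar y > 0$,

$$\Big| \frac{x^2}{y} - \frac{\bar x^2}{\bar y} \Big| \le \frac{|x + \bar x|}{y} \, |x - \bar x| + \frac{\bar x^2}{y \bar y} \, |y - \bar y| ,$$

we obtain for $r \in R$ and $|h| \le \frac{1}{N}$,

$$\Big| \frac{|V_i(r)|^2}{\theta\big( \rho(r - h e_i), \rho(r + h e_i) \big)} - \frac{|V_i(r + h e_i)|^2}{\rho(r + h e_i)} \Big| \le \frac{C}{N} \Big( \frac{\|V\|_{L^\infty} \mathrm{Lip}(V)}{\min \rho} + \frac{\|V\|^2_{L^\infty} \mathrm{Lip}(\rho)}{(\min \rho)^2} \Big)$$

for some universal constant $C > 0$. Combining this bound with (3.4), and summing over all
$R \in \mathcal{R}_N$ the result follows. □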

The previous result can be used to obtain the following lower bound for the Wasserstein
metric $W_2$.

Proposition 3.10. Let $s > 0$. There exists a dimensional constant $C(s) < \infty$ such that for
all probability measures $\mu_0, \mu_1 \in \mathscr{P}(\mathbf{T}^d)$ and for all $N \ge 1$ we have

$$\mathcal{W}_N\big( \mathcal{P}_N(H_s(\mu_0)), \mathcal{P}_N(H_s(\mu_1)) \big) \le W_2(\mu_0, \mu_1) + \frac{C(s)}{\sqrt{N}} .$$

Proof. Let $(\mu_t)$ be a constant speed geodesic connecting $\mu_0$ to $\mu_1$ in $(\mathscr{P}(\mathbf{T}^d), W_2)$, and let $(v_t)$
denote the corresponding velocity vector field achieving the minimum in (2.1). For $s > 0$, let
$\rho_{s,t}$ and $V_{s,t}$ be the densities with respect to $\pi$ of $H_s(\mu_t)$ and $H_s(v_t \mu_t)$ respectively. According
to (iii) of Proposition 2.9, for given $s > 0$, the curve $t \mapsto (\rho_{s,t}, V_{s,t})$ is a solution to the
continuity equation (2.3) and we have

$$\int_0^1 \int_{\mathbf{T}^d} \frac{|V_{s,t}(x)|^2}{\rho_{s,t}(x)} \, dx \, dt \le W_2^2(\mu_0, \mu_1) . \tag{3.5}$$

By (i) and (ii) of Proposition 2.9 we also know that there exist constants $c(s) > 0$ and
$C(s) < \infty$ such that for all $t \in [0,1]$,

$$\inf_x \rho_{s,t}(x) \ge c(s) , \qquad \mathrm{Lip}(\rho_{s,t}) \le C(s) , \qquad \|V_{s,t}\|_{L^\infty} + \mathrm{Lip}(V_{s,t}) \le C(s) \|V_{s/2,t}\|_{L^1} . \tag{3.6}$$

Set $t \mapsto \eta_{N,t} := \mathcal{P}_N(H_s(\mu_t))$ and $t \mapsto W_{N,t} := \mathcal{P}_N(V_{s,t})$. By Proposition 3.3 the curve
$(\eta_{N,t}, W_{N,t})$ solves the continuity equation (2.9). Applying Lemma 3.9, (3.6) and (3.5), we
obtain for some (different) dimensional constant $C(s) < \infty$,

$$\mathcal{W}_N\big( \mathcal{P}_N(H_s(\mu_0)), \mathcal{P}_N(H_s(\mu_1)) \big)^2 \le \int_0^1 \mathcal{A}_N(\eta_{N,t}, W_{N,t}) \, dt$$
$$\le \int_0^1 \Big[ \int_{\mathbf{T}^d} \frac{|V_{s,t}(x)|^2}{\rho_{s,t}(x)} \, dx + \frac{Cd}{N} \Big( \frac{\|V_{s,t}\|_{L^\infty} \mathrm{Lip}(V_{s,t})}{\min \rho_{s,t}} + \frac{\|V_{s,t}\|^2_{L^\infty} \mathrm{Lip}(\rho_{s,t})}{(\min \rho_{s,t})^2} \Big) \Big] dt$$
$$\le W_2^2(\mu_0, \mu_1) + \frac{C(s)}{N} \int_0^1 \|V_{s/2,t}\|^2_{L^1} \, dt .$$

Applying the Cauchy-Schwarz inequality in the form

$$\|V_{s/2,t}\|^2_{L^1} \le \int_{\mathbf{T}^d} \frac{|V_{s/2,t}(x)|^2}{\rho_{s/2,t}(x)} \, dx$$

together with (3.5), we obtain

$$\mathcal{W}_N\big( \mathcal{P}_N(H_s(\mu_0)), \mathcal{P}_N(H_s(\mu_1)) \big)^2 \le W_2(\mu_0, \mu_1)^2 + \frac{C(s)}{N} \int_0^1 \int_{\mathbf{T}^d} \frac{|V_{s/2,t}(x)|^2}{\rho_{s/2,t}(x)} \, dx \, dt$$
$$\le W_2(\mu_0, \mu_1)^2 + \frac{C(s)}{N} W_2(\mu_0, \mu_1)^2 .$$

Taking into account that $(\mathscr{P}(\mathbf{T}^d), W_2)$ has finite diameter, we obtain the result by taking
square roots and using that $\sqrt{a+b} \le \sqrt{a} + \sqrt{b}$. □

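The next result provides an upper bound for $W_2$. Recall that $\widetilde{\mathcal{W}}_N$ is defined using the
harmonic mean instead of the logarithmic mean.

Proposition 3.11. Let $N \ge 1$ and $\rho^N_0, \rho^N_1 \in \mathscr{P}(\mathbf{T}^d_N)$. Then

$$W_2(\mathcal{Q}_N(\rho^N_0), \mathcal{Q}_N(\rho^N_1)) \le \widetilde{\mathcal{W}}_N(\rho^N_0, \rho^N_1) . \tag{3.7}$$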
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Proof. Let t (ft , V t ) be a solution to the continuity equation fl2.9| ). Define ft := Sjv(pt ) 
and Vt := Qa^V/ 7 ). Then, for every £ E [0, 1] we have 



|V f Qr)| ; 
T d p t (x) 

1 



drc 



E 



aeT* 



Pt( x ) 



dx 



tE 



a, i rt v ' 



AT 



l — Nxi + a i N N Nxi — a>i N N 

V t (/Vi~JH TTTICt — V t 



2dN 



2dN 



dxi 



< 



4d 2 N d + 
1 

4d 2 N d + 2 
1 

4d 2 iV d + 2 
1 

Ad?N d + 2 



a. i 



2^(a) 



T/JV/piV \2 



+ 



p*(a) ^(a + et) 



E 



Since from Proposition [y^ we know that $t \mapsto (\rho_t, V_t)$ solves the continuity equation, we obtain



W|(p 0) ft) < 



iT d 



|V f (x)p 
Pt( x ) 



dx dt < 



1 



4d 2 iV d + 2 



E 



Pf(^) 



Taking the infimum over all the solutions $(\rho_t^N, V_t^N)$ of (2.9) and recalling the Definition ^8 of $\widetilde{\mathcal{W}}_N$ we get the result. □

For regular densities, the following result compares the distances defined using the harmonic and the logarithmic means. Note that the reverse inequality $\mathcal{W}_N \le \widetilde{\mathcal{W}}_N$, with $\widetilde{\mathcal{W}}_N$ the harmonic-mean distance, follows directly from the harmonic-logarithmic mean inequality.

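Proposition 3.12. Let $\delta > 0$, $N > \delta^{-2}$ and $\rho_0^N, \rho_1^N \in \mathscr{P}_\delta(\mathbb{T}_N^d)$. Then the following estimate holds:

\[
\widetilde{\mathcal{W}}_N(\rho_0^N, \rho_1^N) \le \Big(1 - \frac{1}{\delta^4 N^2}\Big)^{-\frac12}\, \mathcal{W}_{N,\delta}(\rho_0^N, \rho_1^N). \tag{3.8}
\]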



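Proof. Let $b > a > 0$ and, as before, let $\tilde\theta(a,b) := \frac{2ab}{a+b}$ be the harmonic mean. Set $f(t) := ((1-t)a + tb)^{-1}$ and notice that

\[
\frac{1}{\theta(a,b)} = \int_0^1 f(t)\,dt, \qquad \frac{1}{\tilde\theta(a,b)} = \frac12\big(f(0) + f(1)\big).
\]

Since $f$ is convex and non-increasing, we obtain

\[
\begin{aligned}
\frac{1}{\tilde\theta(a,b)} - \frac{1}{\theta(a,b)}
&= \frac12 \int_0^1 \big(f(0) - f(t)\big)\,dt + \frac12 \int_0^1 \big(f(1) - f(t)\big)\,dt \\
&\le \frac12 \big(f'(1) - f'(0)\big) = \frac{(b-a)^2}{ab\,\tilde\theta(a,b)}.
\end{aligned}
\]

Therefore, for $\rho^N \in \mathscr{P}_\delta(\mathbb{T}_N^d)$ and neighbouring points $\mathbf{a}, \mathbf{a}' \in \mathbb{T}_N^d$ we have

\[
\frac{1}{\tilde\theta(\rho^N(\mathbf{a}), \rho^N(\mathbf{a}'))} \le \Big(1 - \frac{1}{\delta^4 N^2}\Big)^{-1} \frac{1}{\theta(\rho^N(\mathbf{a}), \rho^N(\mathbf{a}'))},
\]

and the result follows by applying this inequality along a geodesic in $(\mathscr{P}_\delta(\mathbb{T}_N^d), \mathcal{W}_{N,\delta})$ connecting $\rho_0^N$ to $\rho_1^N$. □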
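The elementary inequality between the two means used in this proof can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names and the sampling range are hypothetical, and the bound tested is $0 \le 1/\theta_{\mathrm{harm}} - 1/\theta_{\mathrm{log}} \le (b-a)^2/(ab\,\theta_{\mathrm{harm}})$, where $\theta_{\mathrm{harm}}$ and $\theta_{\mathrm{log}}$ denote the harmonic and logarithmic means.

```python
import math
import random


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean (b - a) / (log b - log a), with log_mean(a, a) = a."""
    if a == b:
        return a
    # log1p keeps the quotient accurate when a and b are close.
    return (b - a) / math.log1p((b - a) / a)


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean 2ab / (a + b)."""
    return 2.0 * a * b / (a + b)


def gap_bound_holds(a: float, b: float, tol: float = 1e-9) -> bool:
    """Check 0 <= 1/harmonic - 1/logarithmic <= (b - a)^2 / (a * b * harmonic)."""
    gap = 1.0 / harmonic_mean(a, b) - 1.0 / log_mean(a, b)
    bound = (b - a) ** 2 / (a * b * harmonic_mean(a, b))
    return -tol <= gap <= bound + tol


def check_random_pairs(samples: int = 10_000, seed: int = 0) -> bool:
    """Test the bound on random pairs drawn from (0.01, 10), an arbitrary range."""
    rng = random.Random(seed)
    return all(
        gap_bound_holds(rng.uniform(0.01, 10.0), rng.uniform(0.01, 10.0))
        for _ in range(samples)
    )
```

Calling `check_random_pairs()` tests the bound on random positive pairs. For densities bounded below by $\delta$ with neighbouring values differing by at most $1/(\delta N)$, the right-hand side is at most $\delta^{-4}N^{-2}/\theta_{\mathrm{harm}}$, which is how the factor in (3.8) arises.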

The final proposition in this subsection shows that regular densities can be connected by 
a curve consisting of (a bit less) regular densities, for which the action functional is almost 
optimal. 

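Proposition 3.13. Let $\varepsilon, \delta \in (0,1)$. Then there exists $\bar\delta > 0$ such that for any $N \ge 4$ and $\rho_{N,0}, \rho_{N,1} \in \mathscr{P}_\delta(\mathbb{T}_N^d)$, we have the bound

\[
\mathcal{W}_{N,\bar\delta}(\rho_{N,0}, \rho_{N,1}) \le \mathcal{W}_N(\rho_{N,0}, \rho_{N,1}) + \varepsilon. \tag{3.9}
\]

Proof. Let $a, b \in (0,\delta)$ to be fixed later and let $t \mapsto (\rho_{N,t}, V_{N,t})$ be a $\mathcal{W}_N$-geodesic connecting $\rho_{N,0}$ to $\rho_{N,1}$. Define the curves $t \mapsto (\rho^1_{N,t}, V^1_{N,t})$ and $t \mapsto (\rho^2_{N,t}, V^2_{N,t})$ by

\[
\rho^1_{N,t} := (1-a)\rho_{N,t} + a, \qquad V^1_{N,t} := (1-a)V_{N,t}, \tag{3.10}
\]
\[
\rho^2_{N,t} := H_b(\rho^1_{N,t}), \qquad V^2_{N,t} := H_b(V^1_{N,t}). \tag{3.11}
\]

The latter expression should be interpreted in the sense of (2.11).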
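Step 1: From $\rho_{N,j}$ to $\rho^1_{N,j}$ for $j = 0, 1$.

For $j = 0, 1$, we define $s \mapsto \eta_{N,s,j}$ as the linear interpolation between $\rho_{N,j}$ and $\rho^1_{N,j}$, i.e.,

\[
\eta_{N,s,j}(\mathbf{a}) := (1-s)\rho_{N,j}(\mathbf{a}) + s\rho^1_{N,j}(\mathbf{a}) = \rho_{N,j}(\mathbf{a}) + sa\big(1 - \rho_{N,j}(\mathbf{a})\big).
\]

Notice that since $\sum_{\mathbf{a} \in \mathbb{T}_N^d} \big(1 - \rho_{N,j}(\mathbf{a})\big) = 0$, it makes sense to define

\[
W_{N,s,j}(\mathbf{a}, \mathbf{a} \pm e_i) := \mp 2adN^2 \Big( \Delta_N^{-1}(1 - \rho_{N,j})(\mathbf{a} \pm e_i) - \Delta_N^{-1}(1 - \rho_{N,j})(\mathbf{a}) \Big),
\]

with $1$ being the density constantly equal to one. A direct computation shows that $s \mapsto (\eta_{N,s,j}, W_{N,s,j})$ is a solution to the continuity equation (2.9). Notice that actually $W_{N,s,j}$ does not depend on $s$. Taking into account that

\[
\eta_{N,s,j}(\mathbf{a}) \ge a, \qquad \mathbf{a} \in \mathbb{T}_N^d,\ s \in [0,1],\ j = 0, 1, \tag{3.12}
\]

recalling the Poincaré inequality (Proposition 2.4), and using the trivial bound

\[
\mathcal{E}_N(1 - \rho_{N,j}) \le d\,\big(\mathrm{Lip}_N(\rho_{N,j})\big)^2 \le d\,\delta^{-2},
\]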
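we obtain

\[
\begin{aligned}
\mathcal{A}_N(\eta_{N,s,j}, W_{N,s,j})
&\le \frac{a}{N^{d-2}} \sum_{\mathbf{a} \in \mathbb{T}_N^d} \sum_{i=1}^d \Big( \Delta_N^{-1}(1 - \rho_{N,j})(\mathbf{a} + e_i) - \Delta_N^{-1}(1 - \rho_{N,j})(\mathbf{a}) \Big)^2 \\
&= a\,\mathcal{E}_N\big(\Delta_N^{-1}(1 - \rho_{N,j})\big) \\
&\le \frac{a}{k^2}\,\mathcal{E}_N(1 - \rho_{N,j}) \\
&\le \frac{ad}{k^2\delta^2},
\end{aligned} \tag{3.13}
\]

where $k := \inf_{N \ge 4} 2N^2\big(1 - \cos(2\pi/N)\big) > 0$. Notice also that

\[
\mathrm{Lip}_N(\eta_{N,s,j}) \le \mathrm{Lip}_N(\rho_{N,j}) \le \delta^{-1}, \qquad s \in [0,1],\ j = 0, 1. \tag{3.14}
\]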
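The positivity of the constant $k$ can be made concrete. Since $2N^2(1 - \cos(2\pi/N)) = 4N^2\sin^2(\pi/N)$ is increasing in $N$ and tends to $4\pi^2$, the infimum over $N \ge 4$ is attained at $N = 4$ and equals $32$. The following Python sketch (with hypothetical function names, and a finite range of $N$ as a stand-in for the infimum) illustrates this numerically.

```python
import math


def gap(n: int) -> float:
    """2 n^2 (1 - cos(2 pi / n)), written as 4 n^2 sin^2(pi / n) for stability."""
    return 4.0 * n * n * math.sin(math.pi / n) ** 2


def k_lower_bound(n_max: int = 2000) -> float:
    """Minimum of gap(n) over 4 <= n <= n_max, a finite proxy for the infimum."""
    return min(gap(n) for n in range(4, n_max + 1))
```

Here `gap(4)` evaluates to $32$ up to rounding, and `gap(n)` stays below $4\pi^2 \approx 39.48$ for every `n`.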
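Step 2: From $\rho^1_{N,j}$ to $\rho^2_{N,j}$ for $j = 0, 1$.

For $j = 0, 1$ we interpolate between $\rho^1_{N,j}$ and $\rho^2_{N,j}$ using the heat flow, i.e., we define $s \mapsto (\sigma_{N,s,j}, Z_{N,s,j})$ by

\[
\sigma_{N,s,j} := H_{sb}(\rho^1_{N,j}), \qquad
Z_{N,s,j}(\mathbf{a}, \mathbf{a} \pm e_i) := \mp 2bdN^2 \big( \sigma_{N,s,j}(\mathbf{a} \pm e_i) - \sigma_{N,s,j}(\mathbf{a}) \big).
\]

We then obtain

\[
\begin{aligned}
\mathcal{A}_N(\sigma_{N,s,j}, Z_{N,s,j})
&= \frac{1}{4d^2N^{d+2}} \sum_{\mathbf{a} \in \mathbb{T}_N^d} \sum_{i=1}^d \frac{4b^2d^2N^4 \big( \sigma_{N,s,j}(\mathbf{a} + e_i) - \sigma_{N,s,j}(\mathbf{a}) \big)^2}{\theta\big(\sigma_{N,s,j}(\mathbf{a}), \sigma_{N,s,j}(\mathbf{a} + e_i)\big)} \\
&= \frac{b^2}{N^{d-2}} \sum_{\mathbf{a} \in \mathbb{T}_N^d} \sum_{i=1}^d \big( \sigma_{N,s,j}(\mathbf{a} + e_i) - \sigma_{N,s,j}(\mathbf{a}) \big) \big( \log \sigma_{N,s,j}(\mathbf{a} + e_i) - \log \sigma_{N,s,j}(\mathbf{a}) \big) \\
&= b^2\,\mathcal{E}_N\big(\sigma_{N,s,j}, \log \sigma_{N,s,j}\big).
\end{aligned}
\]

In view of Proposition 2.10 (i) we obtain by construction,

\[
\sigma_{N,s,j}(\mathbf{a}) \ge a, \qquad \mathbf{a} \in \mathbb{T}_N^d,\ s \in [0,1],\ j = 0, 1, \tag{3.15}
\]
\[
\mathrm{Lip}_N(\sigma_{N,s,j}) \le \mathrm{Lip}_N(\rho^1_{N,j}) \le \delta^{-1}, \qquad s \in [0,1],\ j = 0, 1. \tag{3.16}
\]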

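Hence $\mathrm{Lip}_N(\log \sigma_{N,s,j}) \le \frac{\mathrm{Lip}_N(\sigma_{N,s,j})}{\min \sigma_{N,s,j}} \le \delta^{-2}$. Since $|\mathcal{E}_N(f,g)| \le d\,\mathrm{Lip}_N(f)\,\mathrm{Lip}_N(g)$ we obtain

\[
\mathcal{A}_N(\sigma_{N,s,j}, Z_{N,s,j}) \le \frac{d\,b^2}{\delta^3}. \tag{3.17}
\]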
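The computation in Step 2 rests on the pointwise identity $(y-x)^2/\theta(x,y) = (y-x)(\log y - \log x)$ for the logarithmic mean $\theta$, which turns the action of the heat flow into a Dirichlet-type pairing with $\log \sigma$. A minimal Python sketch of this identity on a periodic one-dimensional profile follows; the function names are hypothetical and the normalising prefactors of $\mathcal{A}_N$ and $\mathcal{E}_N$ are deliberately left out.

```python
import math
from typing import Sequence


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean (b - a) / (log b - log a), with log_mean(a, a) = a."""
    if a == b:
        return a
    return (b - a) / math.log1p((b - a) / a)


def action_density(x: float, y: float) -> float:
    """Squared increment divided by the logarithmic mean of the endpoints."""
    return (y - x) ** 2 / log_mean(x, y)


def entropy_density(x: float, y: float) -> float:
    """Increment of the density times the increment of its logarithm."""
    return (y - x) * (math.log(y) - math.log(x))


def max_discrepancy(values: Sequence[float]) -> float:
    """Largest gap between the two expressions over neighbouring pairs (periodic)."""
    n = len(values)
    return max(
        abs(
            action_density(values[i], values[(i + 1) % n])
            - entropy_density(values[i], values[(i + 1) % n])
        )
        for i in range(n)
    )
```

For any strictly positive profile, `max_discrepancy` should vanish up to floating-point rounding.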



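Step 3: From $\rho^2_{N,0}$ to $\rho^2_{N,1}$.

From the convexity of the function $(x, a, b) \mapsto \frac{x^2}{\theta(a,b)}$ we get

\[
\mathcal{A}_N(\rho^1_{N,t}, V^1_{N,t}) \le (1-a)\,\mathcal{A}_N(\rho_{N,t}, V_{N,t}) = (1-a)\,\mathcal{W}_N(\rho_{N,0}, \rho_{N,1})^2
\]

for any $t \in [0,1]$. Using again the convexity of $(x, a, b) \mapsto \frac{x^2}{\theta(a,b)}$ and the fact that $H_b$ acts as a convolution semigroup, we also get

\[
\mathcal{A}_N(\rho^2_{N,t}, V^2_{N,t}) \le \mathcal{A}_N(\rho^1_{N,t}, V^1_{N,t})
\]

for any $t \in [0,1]$. Combining these two inequalities and integrating we get

\[
\int_0^1 \mathcal{A}_N(\rho^2_{N,t}, V^2_{N,t})\,dt \le \int_0^1 \mathcal{A}_N(\rho^1_{N,t}, V^1_{N,t})\,dt \le (1-a)\,\mathcal{W}_N(\rho_{N,0}, \rho_{N,1})^2. \tag{3.18}
\]

Since the heat semigroup preserves positivity, we obtain

\[
\rho^2_{N,t}(\mathbf{a}) \ge a, \qquad \mathbf{a} \in \mathbb{T}_N^d,\ t \in [0,1], \tag{3.19}
\]

and by (i) of Proposition 2.10 we have

\[
\mathrm{Lip}_N(\rho^2_{N,t}) \le C\,b^{-(d+1)/2}, \qquad t \in [0,1], \tag{3.20}
\]

for some universal constant $C > 0$.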
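Both inequalities in Step 3 come from the joint convexity of $(x, a, b) \mapsto x^2/\theta(a,b)$ on $\mathbb{R} \times (0,\infty)^2$. The Python sketch below is a numerical illustration only: it tests midpoint convexity on random samples and the pointwise form of the mixing bound obtained by interpolating with the point $(0, 1, 1)$. All function names and sampling ranges are assumptions made for the example.

```python
import math
import random


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean (b - a) / (log b - log a), with log_mean(a, a) = a."""
    if a == b:
        return a
    return (b - a) / math.log1p((b - a) / a)


def integrand(x: float, a: float, b: float) -> float:
    """The action integrand x^2 / theta(a, b)."""
    return x * x / log_mean(a, b)


def midpoint_convex(p, q, tol: float = 1e-9) -> bool:
    """Check F((p + q) / 2) <= (F(p) + F(q)) / 2 for F = integrand."""
    mid = tuple((u + v) / 2.0 for u, v in zip(p, q))
    return integrand(*mid) <= 0.5 * (integrand(*p) + integrand(*q)) + tol


def mixing_bound_holds(x, p, q, a, tol: float = 1e-9) -> bool:
    """Check F((1-a)x, (1-a)p + a, (1-a)q + a) <= (1-a) F(x, p, q)."""
    lhs = integrand((1 - a) * x, (1 - a) * p + a, (1 - a) * q + a)
    return lhs <= (1 - a) * integrand(x, p, q) + tol


def random_point(rng: random.Random):
    """A sample (x, a, b) with x in [-5, 5] and a, b in [0.1, 5]."""
    return (rng.uniform(-5, 5), rng.uniform(0.1, 5), rng.uniform(0.1, 5))


def check_random(samples: int = 5000, seed: int = 1) -> bool:
    """Run both checks on random samples."""
    rng = random.Random(seed)
    for _ in range(samples):
        p, q = random_point(rng), random_point(rng)
        if not midpoint_convex(p, q):
            return False
        if not mixing_bound_holds(*p, rng.uniform(0.0, 1.0)):
            return False
    return True
```

The mixing bound is the pointwise counterpart of the first inequality of Step 3: the constant density $1$ with zero velocity contributes nothing to the action, which is why only the factor $1 - a$ survives.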
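Step 4: Gluing the pieces.

Let $\ell \in (0, 1/4)$ to be fixed later. We define the curve $t \mapsto (\rho^3_{N,t}, V^3_{N,t})$ on $[0,1]$ by gluing the pieces together, that is,

\[
(\rho^3_{N,t}, V^3_{N,t}) :=
\begin{cases}
\big( \eta_{N,\frac{t}{\ell},0},\ \tfrac{1}{\ell} W_{N,\frac{t}{\ell},0} \big), & t \in [0, \ell], \\[2pt]
\big( \sigma_{N,\frac{t-\ell}{\ell},0},\ \tfrac{1}{\ell} Z_{N,\frac{t-\ell}{\ell},0} \big), & t \in (\ell, 2\ell), \\[2pt]
\big( \rho^2_{N,\frac{t-2\ell}{1-4\ell}},\ \tfrac{1}{1-4\ell} V^2_{N,\frac{t-2\ell}{1-4\ell}} \big), & t \in [2\ell, 1-2\ell], \\[2pt]
\big( \sigma_{N,\frac{1-\ell-t}{\ell},1},\ -\tfrac{1}{\ell} Z_{N,\frac{1-\ell-t}{\ell},1} \big), & t \in (1-2\ell, 1-\ell), \\[2pt]
\big( \eta_{N,\frac{1-t}{\ell},1},\ -\tfrac{1}{\ell} W_{N,\frac{1-t}{\ell},1} \big), & t \in [1-\ell, 1].
\end{cases}
\]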

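Clearly, $t \mapsto (\rho^3_{N,t}, V^3_{N,t})$ is a solution to the continuity equation (2.9). From (3.13), (3.17) and (3.18) we get, taking the scaling factors into account,

\[
\int_0^1 \mathcal{A}_N(\rho^3_{N,t}, V^3_{N,t})\,dt \le \frac{2ad}{\ell k^2 \delta^2} + \frac{2db^2}{\ell \delta^3} + \frac{1-a}{1-4\ell}\,\mathcal{W}_N(\rho_{N,0}, \rho_{N,1})^2.
\]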



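It remains to fix the constants $a, b \in (0,\delta)$ and $\ell \in (0, 1/4)$ as functions of $\delta$ and $\varepsilon$. From (ii) of Proposition we know that the diameter of $(\mathscr{P}(\mathbb{T}_N^d), \mathcal{W}_N)$ is bounded by a constant $D > 0$ depending only on $d$. Choose now $\ell > 0$ so small that $\frac{1}{1-4\ell} \le 1 + \frac{\varepsilon^2}{3D^2}$ and then $a, b > 0$ so small that

\[
\frac{2ad}{\ell k^2 \delta^2} \le \frac{\varepsilon^2}{3}, \qquad \frac{2db^2}{\ell \delta^3} \le \frac{\varepsilon^2}{3}.
\]

With these choices we get

\[
\int_0^1 \mathcal{A}_N(\rho^3_{N,t}, V^3_{N,t})\,dt \le \varepsilon^2 + \mathcal{W}_N(\rho_{N,0}, \rho_{N,1})^2. \tag{3.21}
\]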



Furthermore, the inequalities (3.12), (3.15), and (3.19) and the inequalities (3.14), (3.16) and (3.20) imply that

\[
\min \rho^3_{N,t} \ge a, \qquad \mathrm{Lip}_N(\rho^3_{N,t}) \le \max\{\delta^{-1},\ C\,b^{-(d+1)/2}\},
\]

hence $\rho^3_{N,t}$ belongs to $\mathscr{P}_{\bar\delta}(\mathbb{T}_N^d)$ for some $\bar\delta$ depending on $a$, $b$ and $\delta$. The result follows in view of Definition |?7| of $\mathcal{W}_{N,\bar\delta}$. □



3.3. Wrap up and conclusion of the argument. Finally we shall prove Theorem 1.1. Let us first recall one of the equivalent characterisations of Gromov-Hausdorff convergence, which we formulate here as a definition. We refer to, e.g., [18, Definition 27.6 and (27.4)] for more details.

Definition 3.14 (Gromov-Hausdorff Convergence). We say that a sequence of compact metric spaces $(X_n, d_n)$ converges in the sense of Gromov-Hausdorff to a compact metric space $(X, d)$, if there exists a sequence of maps $f_n : X \to X_n$ which are
(i) $\varepsilon_n$-isometric, i.e., for all $x, y \in X$,

\[ |d_n(f_n(x), f_n(y)) - d(x,y)| \le \varepsilon_n; \text{ and} \]

(ii) $\varepsilon_n$-surjective, i.e., for all $x_n \in X_n$ there exists $x \in X$ with

\[ d_n(f_n(x), x_n) \le \varepsilon_n, \]

for some sequence $\varepsilon_n \to 0$.
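For finite metric spaces the two conditions of this definition can be evaluated directly. The Python sketch below computes the smallest $\varepsilon$ for which a given map is $\varepsilon$-isometric and $\varepsilon$-surjective. The function names and the toy example (a fine grid on $[0,1]$ mapped to a coarser grid by rounding) are illustrative assumptions and are not part of the argument of this paper.

```python
from typing import Callable, Sequence, Tuple


def isometry_defect(
    points: Sequence[float],
    d: Callable[[float, float], float],
    f: Callable[[float], float],
    d_n: Callable[[float, float], float],
) -> float:
    """Smallest eps with |d_n(f(x), f(y)) - d(x, y)| <= eps for all x, y."""
    return max(abs(d_n(f(x), f(y)) - d(x, y)) for x in points for y in points)


def surjectivity_defect(
    points: Sequence[float],
    f: Callable[[float], float],
    targets: Sequence[float],
    d_n: Callable[[float, float], float],
) -> float:
    """Smallest eps such that every target is within eps of the image of f."""
    return max(min(d_n(f(x), t) for x in points) for t in targets)


def example() -> Tuple[float, float]:
    """Fine grid on [0, 1] mapped to a coarse grid by rounding to one decimal."""
    fine = [i / 100 for i in range(101)]
    coarse = [j / 10 for j in range(11)]

    def dist(u: float, v: float) -> float:
        return abs(u - v)

    def f(x: float) -> float:
        return round(10 * x) / 10

    return (
        isometry_defect(fine, dist, f, dist),
        surjectivity_defect(fine, f, coarse, dist),
    )
```

In this toy example rounding moves each point by at most $0.05$, so the map is $0.1$-isometric, and every coarse grid point is hit exactly, so the surjectivity defect is zero.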



Now we are ready to prove our main result Theorem 1.1, which we restate for the convenience of the reader.

Theorem. Let $d \ge 1$. Then the metric spaces $(\mathscr{P}(\mathbb{T}_N^d), \mathcal{W}_N)$ converge to $(\mathscr{P}(\mathbb{T}^d), W_2)$ in the sense of Gromov-Hausdorff as $N \to \infty$.

Proof. For $s > 0$ and $N \ge 1$ we consider the map from $\mathscr{P}(\mathbb{T}^d)$ to $\mathscr{P}(\mathbb{T}_N^d)$ given by

\[ \mu \mapsto \mathcal{P}_N(H_s\mu). \]

We claim that for each $s > 0$ there exists $N(s) \ge 1$ such that for all $N \ge N(s)$ this map is both $\varepsilon(s)$-isometric and $\varepsilon(s)$-surjective, for some sequence $\varepsilon(s) \downarrow 0$ as $s \downarrow 0$. This suffices to prove the theorem.



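$\varepsilon(s)$-isometry. Let $\mu_0, \mu_1 \in \mathscr{P}(\mathbb{T}^d)$. Part (i) of Proposition 2.1 in conjunction with (3.2) yields that $\mathcal{P}_N(H_s\mu_0)$ and $\mathcal{P}_N(H_s\mu_1)$ belong to $\mathscr{P}_{\delta(s)}(\mathbb{T}_N^d)$ for some $\delta(s) > 0$ and for any $N \ge 1$. Let $\eta > 0$. From Proposition 3.13 we then get the existence of $\bar\delta(\eta, s) > 0$ such that

\[
\mathcal{W}_{N,\bar\delta(\eta,s)}\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big) \le \mathcal{W}_N\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big) + \eta.
\]

From Proposition 3.12 we infer that, with $\widetilde{\mathcal{W}}_N$ the harmonic-mean distance,

\[
\widetilde{\mathcal{W}}_N\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big) \le \Big(1 - \frac{1}{\bar\delta(\eta,s)^4 N^2}\Big)^{-\frac12}\, \mathcal{W}_{N,\bar\delta(\eta,s)}\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big),
\]

and then from Proposition 3.11 that

\[
W_2\big(\mathcal{Q}_N(\mathcal{P}_N(H_s\mu_0)), \mathcal{Q}_N(\mathcal{P}_N(H_s\mu_1))\big) \le \widetilde{\mathcal{W}}_N\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big).
\]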

Lemma ^4 and Proposition |2.9| (i) yield 

w 2 (jm, mi) < w 2 {Q N (r N (^o)), Q N (r N (H s ^))) + 2cy^ + 2^ . 

Combining these four inequalities, we obtain 
On the other hand, Proposition 3.10 grants that

\[
\mathcal{W}_N\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big) \le W_2(\mu_0, \mu_1) + \frac{C(s)}{\sqrt{N}}.
\]



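Taking Proposition 2.8 (ii) into account, the latter two inequalities yield that for $N(s)$ sufficiently large and $\eta = \eta(s)$ sufficiently small, we have for all $N \ge N(s)$,

\[
\big| \mathcal{W}_N\big(\mathcal{P}_N(H_s\mu_0), \mathcal{P}_N(H_s\mu_1)\big) - W_2(\mu_0, \mu_1) \big| \le \varepsilon(s)
\]

for some $\varepsilon(s) \downarrow 0$ as $s \downarrow 0$.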



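$\varepsilon(s)$-surjectivity. Let $\rho^N \in \mathscr{P}(\mathbb{T}_N^d)$ and set $\rho_s := H_s \mathcal{Q}_N(\rho^N)$. Then, for some dimensional constant $C < \infty$ which may change from line to line, we obtain using Proposition 2.8, Lemma 3.5, and Proposition 2.9 (i),

\[
\begin{aligned}
\mathcal{W}_N\big(\rho^N, \mathcal{P}_N(\rho_s)\big)
&= \mathcal{W}_N\big(\mathcal{P}_N(\mathcal{Q}_N(\rho^N)), \mathcal{P}_N(\rho_s)\big) \\
&\le C\,W_{2,N}\big(\mathcal{P}_N(\mathcal{Q}_N(\rho^N)), \mathcal{P}_N(\rho_s)\big) \\
&\le C\,W_2\big(\mathcal{Q}_N(\rho^N), \rho_s\big) + \frac{C}{N}.
\end{aligned}
\]

Taking, say, $N \ge 1/\sqrt{s}$, we infer that $\mathcal{P}_N \circ H_s$ is $2C\sqrt{s}$-surjective, which completes the proof. □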